Mental Images of Text: Learning Document Similarity using Web Photos
Abstract
Modern search engines rely solely on text to analyze the content of Web documents. However, it is well known that humans often incorporate "mental visualization", in the form of mental images, in order to interpret text. Psychological studies have demonstrated that humans are able to create these visual perceptions even in the absence of external visual stimuli. Such physiological behavior in the human brain suggests that incorporating visual information into machine text analysis could be beneficial as well. In this paper, we present a method that incorporates the idea of mental images to learn a document similarity metric exploiting both text and visual data. The key idea behind our method is to associate a visual representation with the most salient portions of the original text. We do so by leveraging text-based image search engines: the salient portions of text are issued as queries, and the photos retrieved can be viewed as akin to the mental images used by humans. We then combine these images with the original text representation in order to perform a joint analysis of text and visual data. We demonstrate this approach on the task of semantically clustering Web search results for a given query. Our experiments indicate that incorporating images leads to increased accuracy relative to systems relying on text only. In addition, our method learns a universal document similarity metric that generalizes to any query and to arbitrary documents, a property that provides an efficient framework for making predictions at search time. Finally, we show that the concept of mental images can be successfully applied to different text domains, which suggests that our method can be generalized to many different tasks.
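The joint analysis of text and visual data described in the abstract can be illustrated with a minimal Python sketch. It is not the method of the paper, whose features and learned metric are not specified here. As assumptions for illustration only, each document carries a bag-of-words text vector and a list of hypothetical per-photo feature vectors (standing in for the photos an image search would return), and the two cosine similarities are blended with a fixed weight `alpha` rather than a learned metric.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_vector(text):
    """Bag-of-words term counts (an assumed stand-in text representation)."""
    return dict(Counter(text.lower().split()))


def mean_image_vector(image_features):
    """Average the feature vectors of the photos attached to a document."""
    if not image_features:
        return {}
    total = Counter()
    for feats in image_features:
        for key, weight in feats.items():
            total[key] += weight
    n = len(image_features)
    return {key: weight / n for key, weight in total.items()}


def joint_similarity(doc_a, doc_b, alpha=0.5):
    """Blend text and image similarity; alpha is a hand-set assumption."""
    text_sim = cosine(text_vector(doc_a["text"]), text_vector(doc_b["text"]))
    image_sim = cosine(
        mean_image_vector(doc_a["images"]),
        mean_image_vector(doc_b["images"]),
    )
    return alpha * text_sim + (1.0 - alpha) * image_sim


# Hypothetical documents: identical text, different "mental images".
doc_car = {"text": "jaguar top speed", "images": [{"wheel": 1.0, "road": 1.0}]}
doc_cat = {"text": "jaguar top speed", "images": [{"fur": 1.0, "grass": 1.0}]}
doc_car2 = {"text": "jaguar engine", "images": [{"wheel": 1.0, "road": 0.5}]}

print(joint_similarity(doc_car, doc_cat))
print(joint_similarity(doc_car, doc_car2))
```

In this toy example the two car documents score higher than the car and animal documents, even though the latter pair has identical text, because the image features disagree. A learned metric, as in the abstract, would replace the fixed `alpha` blend.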
Similar resources
Geographically-aware Cross-media Retrieval for Associating Photos to Travelogues
Textual documents published on the Web where people describe traveling experiences, usually called travelogues, can provide interesting information about the experiences lived by the respective authors while traveling. Nowadays, several websites can be used for sharing these textual documents, and the use of Web information for travel planning has also increased. Still, the usage of the travelo...
A New Text Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need effective ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...
Text Retrieval from Document Images based on N-Gram Algorithm
In this paper, we propose a method of text retrieval from document images using a similarity measure based on an N-Gram algorithm. We directly extract image features instead of using optical character recognition. Character image objects are extracted from document images based on connected components first and then an unsupervised classifier is used to classify these objects. All objects are e...
Similarity measurement for describe user images in social media
Online social networks like Instagram are places for communication. Also, these media produce rich metadata which are useful for further analysis in many fields including health and cognitive science. Many researchers are using these metadata like hashtags, images, etc. to detect patterns of user activities. However, there are several serious ambiguities like how much reliable are these informa...
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...